5 - Artificial Intelligence II [ID:57505]

Okay, so welcome back. We're still looking at model-based agents that use probability distributions as their world representation, and we're still trying to design agents that have a good model of the world without having to pay a huge computational cost. We've learned a little bit of probability theory, we've learned about various techniques for probabilistic reasoning, and now we're trying to get a handle on how to use them.

The first thing we looked at were these naive Bayes models: situations where we have a couple of random variables, three in this example, with one cause, here Cavity, and n evidence variables, where n can be two as in this example or 300,000 as in the classification example, that are conditionally independent given the cause. Okay, so that's what a naive Bayes model is, and we're interested in using this situation as an efficient way of representing the full joint probability distribution. If you think about the situation here, we have three random variables, all of them Boolean, so we have 2^3 = 8 entries in our full joint probability distribution. That is fine for an example on the slides of a lecture, because we want things to be small.
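To recap this in a formula (a sketch in generic notation, with a cause C and evidence variables E_1, ..., E_n; the lecture's own notation may differ slightly):

$$P(C, E_1, \ldots, E_n) \;=\; P(C)\,\prod_{i=1}^{n} P(E_i \mid C)$$

so instead of one big table over all variables, we only store the prior P(C) and one small conditional table P(E_i | C) per evidence variable.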

for an example on the slides of a lecture because we want things to be small but if we think about

this other example where we had a classification say of length 10 of newspaper articles you remember

those and you have words counts which are also bigger random variables then we have something

given say 300,000 words in English we have a 10 to the 300,000 full joint probability distribution

and that's not something we want to have to store in our little laptops or even in a server farm

it's going to be very expensive so the name of the game is not going by the theoretical tool the
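Just to make the size argument concrete (back-of-the-envelope numbers, assuming roughly ten classes and word-count variables with about ten possible values each, as in the example):

$$\underbrace{\approx 10^{300{,}000}}_{\text{entries in the full joint table}} \qquad \text{versus} \qquad \underbrace{10 + 300{,}000 \cdot 10 \cdot 10 \;\approx\; 3\cdot 10^{7}}_{\text{entries in the naive Bayes tables } P(C) \text{ and } P(E_i \mid C)}$$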

So the name of the game is not to work with the theoretical tool, the mathematical tool of the full joint probability distribution, directly, but to compute the things we need when we need them, and for naive Bayes models that is very easy.

So, we looked at these naive Bayes models, and the upshot, the most important thing, is that somewhere in these computations we use the chain rule. That is a good thing, because it allows us to take all these conjunctions of random variables and express and compute them via conditional probabilities, which we are much more likely to have. And the wonderful thing is that we never need those long conditional probabilities with 300,000 conditioning variables from the big example; each factor only needs one. Why? Because in a naive Bayes model we have the cause and lots of evidence variables, and every factor has the form P(E_i | C): we only condition on the one cause. That makes every individual term much smaller, and we are left with a relatively simple multiplication, still with 300,000 factors. That is still work, but it is a lot less work, and especially it needs a lot less data. So that's the situation here.
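Written out, the step looks roughly like this: the chain rule alone would give ever-longer conditions, and it is the conditional independence given the cause that collapses each of them to just C:

$$P(C, E_1, \ldots, E_n) \;=\; P(C)\,P(E_1 \mid C)\,P(E_2 \mid C, E_1)\cdots P(E_n \mid C, E_1, \ldots, E_{n-1}) \;=\; P(C)\,\prod_{i=1}^{n} P(E_i \mid C)$$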

We've basically worked through this and discovered a couple of things. We've discovered that if we have some evidence variables we can actually observe and others that are unknown, we can get rid of the unknowns by summing up over all of their possible values. That's called marginalization: we marginalize out the things we don't know.
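As a formula sketch (with e standing for the observed evidence values and H for the evidence variables we did not get to observe):

$$P(C, \mathbf{e}) \;=\; \sum_{\mathbf{h}} P(C, \mathbf{e}, \mathbf{h})$$

i.e., we sum the joint over all possible value combinations h of the unobserved variables H.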

The next thing, which is always a little bit of magic to me, is the normalization step. We want to end up with a probability distribution; we have to, because the left-hand side has to sum up to one. So we end up with something that doesn't sum up to one, but we know it has to, and we know that there's this funny divisor which is always the same. We interpret that as a normalization constant that we can compute at the end from the factor by which the result is off from one. So we can push a lot of work into this normalization constant, and it basically solves some of our problems by magic, at least that's what it feels like to me. If you really want to understand what's going on, take your fingers and go through all of this, and you'll see why it works. And if you're like me, you'll immediately forget why it works; I can retain the understanding for a very limited time, but I've convinced myself that it actually works, and I go through it before every lecture, and then I'm always scared that I get it wrong. So it's a little bit of magic.
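In formulas, the trick is that the divisor P(e) does not depend on the value of C, so we can treat it as a normalization constant α and recover it at the very end from the requirement that the result sums to one:

$$P(C \mid \mathbf{e}) \;=\; \frac{P(C, \mathbf{e})}{P(\mathbf{e})} \;=\; \alpha\, P(C, \mathbf{e}), \qquad \alpha \;=\; \frac{1}{\sum_{c} P(c, \mathbf{e})}.$$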

Okay, good. We looked at this in the dentistry example, and we looked at the classification example, which was essentially only there to show you that we often have loads of variables, hundreds of thousands of them, and this still works, no problem, and solves real-world problems.
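To make the "this still works with huge numbers of variables" point tangible, here is a minimal sketch of naive Bayes inference (not the lecture's code; the function, variable names, and toy numbers are made up for illustration). It multiplies the per-variable factors in log space and normalizes at the end; unobserved evidence variables are simply left out, which for a naive Bayes model is exactly marginalization, since their factors would sum to one anyway:

```python
import math

def naive_bayes_posterior(prior, cond, observed):
    """P(class | observed evidence) in a naive Bayes model.

    prior:    dict class -> P(class)
    cond:     dict (class, variable) -> dict value -> P(value | class)
    observed: dict variable -> observed value; unobserved variables are
              simply omitted (their factors would sum to 1 anyway).
    """
    log_scores = {}
    for c, p_c in prior.items():
        s = math.log(p_c)
        for var, val in observed.items():
            s += math.log(cond[(c, var)][val])  # one small table per (class, variable)
        log_scores[c] = s
    # normalization: rescale so the posterior sums to one
    m = max(log_scores.values())
    unnorm = {c: math.exp(s - m) for c, s in log_scores.items()}
    alpha = 1.0 / sum(unnorm.values())
    return {c: alpha * u for c, u in unnorm.items()}

# toy dentistry-style numbers (made up, not the ones from the slides)
prior = {"cavity": 0.2, "no_cavity": 0.8}
cond = {
    ("cavity", "toothache"):    {True: 0.6, False: 0.4},
    ("no_cavity", "toothache"): {True: 0.1, False: 0.9},
    ("cavity", "catch"):        {True: 0.9, False: 0.1},
    ("no_cavity", "catch"):     {True: 0.2, False: 0.8},
}
print(naive_bayes_posterior(prior, cond, {"toothache": True}))
```

With 300,000 evidence variables the inner loop just gets longer; the cost stays linear in the number of observed variables, which is the whole point.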

So we want to look at, unless there are any questions... are there questions? So I would like to look at

Part of a video series
Access: open access
Duration: 01:31:37 min
Recording date: 2025-05-07
Uploaded: 2025-05-09 03:09:05
Language: en-US
